Column-oriented DBMS
   HOME

TheInfoList



OR:

A column-oriented DBMS or columnar DBMS is a
database management system In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases span ...
(DBMS) that stores data tables by
column A column or pillar in architecture and structural engineering is a structural element that transmits, through compression, the weight of the structure above to other structural elements below. In other words, a column is a compression member. ...
rather than by row. Benefits include more efficient access to data when only querying a subset of columns (by eliminating the need to read columns that are not relevant), and more options for data compression. However, they are typically less efficient for inserting new data. Practical use of a column store versus a row store differs little in the
relational DBMS A relational database is a (most commonly digital) database based on the relational model of data, as proposed by E. F. Codd in 1970. A system used to maintain relational databases is a relational database management system (RDBMS). Many relation ...
world. Both columnar and row databases can use traditional database query languages like SQL to load data and perform queries. Both row and columnar databases can become the backbone in a system to serve data for common
extract, transform, load In computing, extract, transform, load (ETL) is a three-phase process where data is extracted, transformed (cleaned, sanitized, scrubbed) and loaded into an output data container. The data can be collated from one or more sources and it can also ...
(ETL) and tools.


Description


Background

A relational database management system provides data that represents a two-dimensional table of columns and rows. For example, a database might have this table: This simple table includes an employee identifier (EmpId), name fields (Lastname and Firstname) and a salary (Salary). This two-dimensional format is an abstraction. In an actual implementation, storage hardware requires the data to be serialized into one form or another. The most expensive operations involving
hard disk A hard disk drive (HDD), hard disk, hard drive, or fixed disk is an electro-mechanical data storage device that stores and retrieves digital data using magnetic storage with one or more rigid rapidly rotating platters coated with magnet ...
s are
seeks Seeks is a free and open-source project licensed under the GNU Affero General Public License version 3 (AGPL-3.0-or-later). It exists to create an alternative to the current market-leading search engines, driven by user concerns rather than cor ...
. In order to improve overall performance, related data should be stored in a fashion to minimize the number of seeks. This is known as
locality of reference In computer science, locality of reference, also known as the principle of locality, is the tendency of a processor to access the same set of memory locations repetitively over a short period of time. There are two basic types of reference localit ...
, and the basic concept appears in a number of different contexts. Hard disks are organized into a series of blocks of a fixed size, typically enough to store several rows of the table. By organizing the table's data so rows fit within these blocks, and grouping related rows onto sequential blocks, the number of blocks that need to be read or sought is minimized in many cases, along with the number of seeks. A survey by Pinnecke et al. covers techniques for column-/row hybridization as of 2017.


Row-oriented systems

A common method of storing a table is to serialize each row of data, like this: 001:10,Smith,Joe,60000; 002:12,Jones,Mary,80000; 003:11,Johnson,Cathy,94000; 004:22,Jones,Bob,55000; As data is inserted into the table, it is assigned an internal ID, the rowid that is used internally in the system to refer to data. In this case the records have sequential rowids independent of the user-assigned empid. In this example, the DBMS uses short integers to store rowids. In practice, larger numbers, 64-bit or 128-bit, are normally used. Row-oriented systems are designed to efficiently return data for an entire row, or record, in as few operations as possible. This matches the common use-case where the system is attempting to retrieve information about a particular object, say the contact information for a user in a
rolodex A Rolodex is a rotating card file device used to store business contact information. Its name, a portmanteau of the words ''rolling'' and ''index'', has become somewhat genericized (usually as ''rolodex'') for any personal organizer performing th ...
system, or product information for an
online shopping Online shopping is a form of electronic commerce which allows consumers to directly buy goods or services from a seller over the Internet using a web browser or a mobile app. Consumers find a product of interest by visiting the website of the r ...
system. By storing the record's data in a single block on the disk, along with related records, the system can quickly retrieve records with a minimum of disk operations. Row-oriented systems are not efficient at performing set-wide operations on the whole table, as opposed to a small number of specific records. For instance, in order to find all records in the example table with salaries between 40,000 and 50,000, the DBMS would have to fully scan through the entire table looking for matching records. While the example table shown above will likely fit in a single disk block, a table with even a few hundred rows would not, and multiple disk operations would be needed to retrieve the data and examine it. To improve the performance of these sorts of operations (which are very common, and generally the point of using a DBMS), most DBMSs support the use of
database index A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space to maintain the index data structure. Indexes are used to quickly locate data without ...
es, which store all the values from a set of columns along with rowid pointers back into the original table. An index on the salary column would look something like this: 55000:004; 60000:001; 80000:002; 94000:003; As they store only single pieces of data, rather than entire rows, indexes are generally much smaller than the main table stores. Scanning this smaller set of data reduces the number of disk operations. If the index is heavily used, it can dramatically reduce the time for common operations. However, maintaining indexes adds overhead to the system, especially when new data is written to the database. Records not only need to be stored in the main table, but any attached indexes have to be updated as well. The main reason why indexes dramatically improve performance on large datasets is that database indexes on one or more columns are typically sorted by value, which makes range queries operations (like the above "find all records with salaries between 40,000 and 50,000" example) very fast (lower time-complexity). A number of row-oriented databases are designed to fit entirely in
RAM Ram, ram, or RAM may refer to: Animals * A male sheep * Ram cichlid, a freshwater tropical fish People * Ram (given name) * Ram (surname) * Ram (director) (Ramsubramaniam), an Indian Tamil film director * RAM (musician) (born 1974), Dutch * ...
, an
in-memory database An in-memory database (IMDB, or main memory database system (MMDB) or memory resident database) is a database management system that primarily relies on main memory for computer data storage. It is contrasted with database management systems that e ...
. These systems do not depend on disk operations, and have equal-time access to the entire dataset. This reduces the need for indexes, as it requires the same amount of operations to fully scan the original data as a complete index for typical aggregation purposes. Such systems may be therefore simpler and smaller, but can only manage databases that will fit in memory.


Column-oriented systems

A column-oriented database serializes all of the values of a column together, then the values of the next column, and so on. For our example table, the data would be stored in this fashion: 10:001,12:002,11:003,22:004; Smith:001,Jones:002,Johnson:003,Jones:004; Joe:001,Mary:002,Cathy:003,Bob:004; 60000:001,80000:002,94000:003,55000:004; In this layout, any one of the columns more closely matches the structure of an index in a row-oriented system. This may cause confusion that can lead to the mistaken belief a column-oriented store "is really just" a row-store with an index on every column. However, it is the mapping of the data that differs dramatically. In a row-oriented system, indices map column values to rowids, whereas in a column-oriented system, columns map rowids to column values. This may seem subtle, but the difference can be seen in this common modification to the same store wherein the two "Jones" items, above, are compressed into a single item with two rowids: …;Smith:001;Jones:002,004;Johnson:003;… Whether or not a column-oriented system will be more efficient in operation depends heavily on the workload being automated. Operations that retrieve all the data for a given object (the entire row) are slower. A row-oriented system can retrieve the row in a single disk read, whereas numerous disk operations to collect data from multiple columns are required from a columnar database. However, these whole-row operations are generally rare. In the majority of cases, only a limited subset of data is retrieved. In a rolodex application, for instance, collecting the first and last names from many rows to build a list of contacts is far more common than reading all data for any single address. This is even more true for writing data into the database, especially if the data tends to be "sparse" with many optional columns. For this reason, column stores have demonstrated excellent real-world performance in spite of many theoretical disadvantages. Partitioning, indexing, caching, views,
OLAP cube An OLAP cube is a multi-dimensional array of data. Online analytical processing (OLAP) is a computer-based technique of analyzing data to look for insights. The term ''cube'' here refers to a multi-dimensional dataset, which is also sometimes ca ...
s, and transactional systems such as
write-ahead logging In computer science, write-ahead logging (WAL) is a family of techniques for providing atomicity and durability (two of the ACID properties) in database systems. A write ahead log is an append-only auxiliary disk-resident structure used for crash ...
or
multiversion concurrency control Multiversion concurrency control (MCC or MVCC), is a concurrency control method commonly used by database management systems to provide concurrent access to the database and in programming languages to implement transactional memory. Description W ...
all dramatically affect the physical organization of either system. That said,
online transaction processing In online transaction processing (OLTP), information systems typically facilitate and manage transaction-oriented applications. This is contrasted with online analytical processing. The term "transaction" can have two different meanings, both of wh ...
(OLTP)-focused RDBMS systems are more row-oriented, while
online analytical processing Online analytical processing, or OLAP (), is an approach to answer multi-dimensional analytical (MDA) queries swiftly in computing. OLAP is part of the broader category of business intelligence, which also encompasses relational databases, repo ...
(OLAP)-focused systems are a balance of row-oriented and column-oriented.


Benefits


Access time

Comparisons between row-oriented and column-oriented databases are typically concerned with the efficiency of hard-disk access for a given workload, as
seek time Higher performance in hard disk drives comes from devices which have better performance characteristics. These performance characteristics can be grouped into two categories: access time and data transfer time (or rate). Access time The ''access ...
is incredibly long compared to the other bottlenecks in computers. For example, a typical
Serial ATA SATA (Serial AT Attachment) is a computer bus interface that connects host bus adapters to mass storage devices such as hard disk drives, optical drives, and solid-state drives. Serial ATA succeeded the earlier Parallel ATA (PATA) standard t ...
(SATA) hard drive has an average seek time of between 16 and 22 milliseconds while DRAM access on an Intel Core i7 processor takes on average 60 nanoseconds, nearly 400,000 times as fast. Clearly, disk access is a major bottleneck in handling big data. Columnar databases boost performance by reducing the amount of data that needs to be read from disk, both by efficiently compressing the similar columnar data and by reading only the data necessary to answer the query. In practice, columnar databases are well-suited for
OLAP Online analytical processing, or OLAP (), is an approach to answer multi-dimensional analytical (MDA) queries swiftly in computing. OLAP is part of the broader category of business intelligence, which also encompasses relational databases, repor ...
-like workloads (e.g.,
data warehouse In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for Business reporting, reporting and data analysis and is considered a core component of business intelligence. DWs are central Repos ...
s) which typically involve highly complex queries over all data (possibly
petabyte The byte is a unit of digital information that most commonly consists of eight bits. Historically, the byte was the number of bits used to encode a single character of text in a computer and for this reason it is the smallest addressable unit ...
s). However, some work must be done to write data into a columnar database. Transactions (INSERTs) must be separated into columns and compressed as they are stored, making it less suited for OLTP workloads. Row-oriented databases are well-suited for
OLTP In online transaction processing (OLTP), information systems typically facilitate and manage transaction-oriented applications. This is contrasted with online analytical processing. The term "transaction" can have two different meanings, both of w ...
-like workloads which are more heavily loaded with interactive transactions. For example, retrieving all data from a single row is more efficient when that data is located in a single location (minimizing disk seeks), as in row-oriented architectures. However, column-oriented systems have been developed as hybrids capable of both OLTP and OLAP operations. Some of the OLTP constraints, faced by such column-oriented systems, are mediated using (amongst other qualities) in-memory data storage. Column-oriented systems suitable for both OLAP and OLTP roles effectively reduce the total data footprint by removing the need for separate systems.


Compression

Column data is of uniform type; therefore, there are some opportunities for storage size optimizations available in column-oriented data that are not available in row-oriented data. For example, many popular modern compression schemes, such as LZW or
run-length encoding Run-length encoding (RLE) is a form of lossless data compression in which ''runs'' of data (sequences in which the same data value occurs in many consecutive data elements) are stored as a single data value and count, rather than as the original ...
, make use of the similarity of adjacent data to compress. Missing values and repeated values, common in clinical data, can be represented by a two-bit marker. While the same techniques may be used on row-oriented data, a typical implementation will achieve less effective results. To improve compression, sorting rows can also help. For example, using bitmap indexes, sorting can improve compression by an order of magnitude. To maximize the compression benefits of the
lexicographical order In mathematics, the lexicographic or lexicographical order (also known as lexical order, or dictionary order) is a generalization of the alphabetical order of the dictionaries to sequences of ordered symbols or, more generally, of elements of a ...
with respect to
run-length encoding Run-length encoding (RLE) is a form of lossless data compression in which ''runs'' of data (sequences in which the same data value occurs in many consecutive data elements) are stored as a single data value and count, rather than as the original ...
, it is best to use low-
cardinality In mathematics, the cardinality of a set is a measure of the number of elements of the set. For example, the set A = \ contains 3 elements, and therefore A has a cardinality of 3. Beginning in the late 19th century, this concept was generalized ...
columns as the first sort keys. For example, given a table with columns sex, age, name, it would be best to sort first on the value sex (cardinality of two), then age (cardinality of <128), then name. Columnar compression achieves a reduction in disk space at the expense of efficiency of retrieval. The greater adjacent compression achieved, the more difficult random-access may become, as data might need to be uncompressed to be read. Therefore, column-oriented architectures are sometimes enriched by additional mechanisms aimed at minimizing the need for access to compressed data.


History

Column stores or transposed files have been implemented from the early days of DBMS development. TAXIR was the first application of a column-oriented database storage system with focus on information-retrieval in biology in 1969. Clinical data from patient records with many more attributes than could be analyzed were processed in 1975 and after by a time-oriented database system (TODS).
Statistics Canada Statistics Canada (StatCan; french: Statistique Canada), formed in 1971, is the agency of the Government of Canada commissioned with producing statistics to help better understand Canada, its population, resources, economy, society, and cultur ...
implemented the RAPID system in 1976 and used it for processing and retrieval of the Canadian Census of Population and Housing as well as several other statistical applications. RAPID was shared with other statistical organizations throughout the world and used widely in the 1980s. It continued to be used by Statistics Canada until the 1990s. Another column-oriented database was SCSS. Later column-oriented database packages included: * 1993: KDB * 1995: Sybase IQ Since about 2004 there have been additional open source and commercial implementations.
MonetDB MonetDB is an open-source column-oriented relational database management system (RDBMS) originally developed at the Centrum Wiskunde & Informatica (CWI) in the Netherlands. It is designed to provide high performance on complex queries against lar ...
was released under an
open-source license An open-source license is a type of license for computer software and other products that allows the source code, blueprint or design to be used, modified and/or shared under defined terms and conditions. This allows end users and commercial compa ...
on September 30, 2004, followed closely by the now defunct
C-Store C-Store is a database management system (DBMS) based on a column-oriented DBMS developed by a team at Brown University, Brandeis University, Massachusetts Institute of Technology and the University of Massachusetts Boston including Michael Stonebr ...
. C-store was a university project that eventually, with team member
Michael Stonebraker Michael Ralph Stonebraker (born October 11, 1943) is a computer scientist specializing in database systems. Through a series of academic prototypes and commercial startups, Stonebraker's research and products are central to many relational databa ...
staying on, led to
Vertica Vertica Systems is an analytic database management software company. Vertica was founded in 2005 by the database researcher Michael Stonebraker, with Andrew Palmer as the founding CEO. Ralph Breslauer and Christopher P. Lynch served as later ...
, which he co-founded in 2005. The MonetDB-related X100 project evolved into
VectorWise Actian Vector (formerly known as VectorWise) is an SQL relational database management system designed for high performance in analytical database applications. It published record breaking results on the Transaction Processing Performance Counc ...
.
Druid A druid was a member of the high-ranking class in ancient Celtic cultures. Druids were religious leaders as well as legal authorities, adjudicators, lorekeepers, medical professionals and political advisors. Druids left no written accounts. Whi ...
is a column-oriented data store that was open-sourced in late 2012 and is now used by numerous organizations. Classic
Relational DBMS A relational database is a (most commonly digital) database based on the relational model of data, as proposed by E. F. Codd in 1970. A system used to maintain relational databases is a relational database management system (RDBMS). Many relation ...
can use column-oriented strategies by mixing row-oriented and column-oriented tables. Despite the DBMS complexity, this approach has proven to be valuable from the years 2010 to present. For example in 2014 Citusdata introduced column-oriented tables for
PostgreSQL PostgreSQL (, ), also known as Postgres, is a free and open-source relational database management system (RDBMS) emphasizing extensibility and SQL compliance. It was originally named POSTGRES, referring to its origins as a successor to the In ...
and McObject added support for columnar storage with its release of
eXtremeDB eXtremeDB is a high performance, low-latency, ACID-compliant embedded database management system using an in-memory database system (IMDS) architecture and designed to be linked into C/ C++ based programs. It works on Windows, Linux, and other ...
Financial Edition in 2012 which was then used to establish a new standard of performance for the independently audited STAC-M3 benchmark.


See also

*
Data warehouse In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for Business reporting, reporting and data analysis and is considered a core component of business intelligence. DWs are central Repos ...
*
List of column-oriented DBMSes This article is a list of Column-oriented DBMS, column-oriented database management system software. Free and open-source software (FOSS) Platform as a Service (PaaS) *Amazon Redshift * Microsoft Azure SQL Data Warehouse * Google BigQuery ...
* AOS and SOA *
RCFile Within computing database management systems, the RCFile (Record Columnar File) is a data placement structure that determines how to store relational tables on computer clusters. It is designed for systems using the MapReduce framework. The RCFile ...
*
BigQuery BigQuery is a fully managed, serverless data warehouse that enables scalable analysis over petabytes of data. It is a ''Platform as a Service'' (PaaS) that supports querying using ANSI SQL. It also has built-in machine learning capabilities. Big ...


References


External links


Distinguishing Two Major Types Of Column-Stores

VLDB 2009 Tutorial
- overview


Weaving Relations for Cache Performance
- column-oriented block layout
The Design and Implementation of Modern Column-Oriented Database Systems
{{DEFAULTSORT:Column-Oriented Dbms Database management systems Database models Types of databases